A Survey on Gender Bias in Natural Language Processing
Therefore, in this
paper, we present an overview of 304 papers on gender bias in natural language processing. We
begin with a brief outline of our methodology and explore the evolution of the field in popular NLP
venues (§2). Then, we discuss different definitions of gender in society (§3). Further, we define
gender bias and sexism in general and in NLP, in particular, incorporating a discussion of their
ethical considerations (§4). Next, we gather common lexica and datasets curated for research on
gender bias (§5). Subsequently, we discuss formal definitions of gender bias (§6). Then, we discuss
methods developed for gender bias detection (§7) and mitigation (§8).
Above we have introduced gender bias and sexism as general terms. In the following, we discuss
how these biases emerge in natural language and ultimately influence many downstream tasks.
Language can be used as a substantial means of expressing gender bias. Gender biases are
translated from source data to existing algorithms that may reflect and amplify existing cultural
prejudices and inequalities by replicating human behavior and perpetuating bias.
Gender bias is known to propagate to models and downstream tasks, posing harm to end users (Bolukbasi et al. 2016). These harms can emerge as representational harms, allocational harms, and gender gaps.
Use this as a premise.
Structural bias arises when the construction of sentences shows patterns
that are closely tied to the presence of gender bias. It encompasses gender generalisation (i.e.,
when a gender-neutral term is assumed to refer to a specific gender based on some (stereotypical)
assumptions) and explicit labeling of sex. On the other hand, contextual bias
Bias Amplification. Previous research has shown that NLP models can not only perpetuate biases present in language, but also amplify them (Zhao et al. 2017). In particular, Zhao et al. (2017) interpret gender bias as correlations that are potentially amplified by the model, and define the gender bias of each word towards man as:
https://gyazo.com/986d795ecd54cd0628526a6fa119b163
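The definition itself survives here only as the linked image. As a minimal sketch, assuming the co-occurrence ratio of Zhao et al. (2017), in which the bias of a word towards man is its co-occurrence count with man divided by its combined count with man and woman, the score could be computed as below. The function names and the toy pairs are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter


def cooccurrence_counts(pairs):
    """Count how often each (word, gender) pair occurs in the data."""
    return Counter(pairs)


def bias_towards(counts, word, gender="man", genders=("man", "woman")):
    """Share of a word's gendered co-occurrences that fall on `gender`.

    Assumed form: c(word, gender) / sum of c(word, g) over all genders g.
    Returns None when the word never co-occurs with any listed gender.
    """
    total = sum(counts[(word, g)] for g in genders)
    if total == 0:
        return None
    return counts[(word, gender)] / total


# Toy data (hypothetical): each pair is one observed co-occurrence.
pairs = (
    [("cooking", "woman")] * 2
    + [("cooking", "man")]
    + [("driving", "man")] * 3
    + [("driving", "woman")]
)
counts = cooccurrence_counts(pairs)

for word in ("cooking", "driving"):
    print(word, bias_towards(counts, word))
```

Under this reading, a score above 0.5 means the word co-occurs more often with man than with woman. Comparing the score computed on training data with the score computed on model predictions would indicate whether the model amplifies the bias.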